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p I , Abstract 

Q*) . The multilevel Monte Carlo path simulation method introduced by Giles (Opera- 

I Hons Research, 56(3):607-617, 2008) exploits strong convergence properties to improve 

l^H ' the computational complexity by combining simulations with different levels of reso- 

qh. lution. Previous research has analysed its efficiency when using the Euler-Maruyama 

discretisation, and also demonstrated its improved efficiency using the Milstein discreti- 
sation with its improved strong convergence. In this paper we analyse its efficiency for 
'!> • scalar SDEs using the Milstein discretisation, bounding the order of convergence of 

the variance of the multilevel estimator, and hence determining the computational 
\^ ' complexity of the method. 
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: 1 Introduction 

In computational finance, Monte Carlo methods are used to estimate the expected 
value E[P] of a discounted payoff function which depends on the solution of an SDE of 
the generic form 

dS = a{S,t)dt + b{S,t)dW, 0<t<T, (1) 

subject to specified initial data 5(0) = Sq. 

Using a simple Monte Carlo method with a numerical discretisation with first order 
weak convergence, to achieve a root-mean-square error of 0{e) would require 0{e~'^) 
independent paths, each with 0{e~^) timesteps, giving a computational complexity 
which is 0{e~^). However, Giles introduced a new multilevel Monte Carlo (MLMC) 
approach [W\ [9] which reduces the cost to 0{e^'^) under certain circumstances. This 
multilevel approach, which is related to the two-level method of Kebaier |17j . and 
Heinrich's multilevel approach for parametric integration jl5j , combines the results of 
simulations with different numbers of timesteps. 
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The key identity underlying the method is 



E[Pl] = E[Po] + Y,nPi-Pi-i]- (2) 

This expresses the expectation on the finest level of resolution, using uniform 
timesteps, as the sum of the expected value on level 0, using just one timestep of 
size T, plus a sum of expected corrections between levels i and i — The quantity 
E[P£ — P£_i] can be estimated using Ni independent samples by 

1=1 

^(i) ^(i) 

Note that the difference P^ — P^_i comes from two discrete approximations with dif- 
ferent timesteps but the same Brownian path; this difference is small because of the 
strong convergence properties of the numerical discretisation. The variance of this 
simple estimator is V[l^] = N^'^Vi where is the variance of a single sample. It is 
the convergence of as £ — )• oo which is the focus of this paper, because of its central 
role in the following theorem [10] which comes from an optimal choice of the number 
of samples to be used on each level. 

Theorem 1.1 Let P denote a functional of the solution of stochastic differential equa- 
tion ([ip for a given Brownian path W , and let P^ denote the corresponding approxima- 
tion using a numerical discretisation with timestep hi = 2^^ T. 

If there exist independent estimators Yi based on Monte Carlo samples, and 
positive constants a > ^, /3, ci, C2, C3 such that 



i) E[P^-P] 

ii) n%] = 



< ci /i? 

E[Po], i = 

Hi) Y[Ye] < C2 N-^h^^ 

iv) Ci, the computational complexity ofYi, is bounded by 

Ci < C3Nihj\ 

then there exists a positive constant C4 such that for any e<e~^ there are values L and 
Ni for which the multilevel estimator 

L 

£=0 

has a mean- square- error with bound 



USE = E 



Y - E[P] 
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option 


E 

numerical 


;uler 
analysis 


M 

numerical 


ilstein 
analysis 


Lipschitz 

Asian 

lookback 

barrier 

digital 


0{h) 
0{h) 
0{h) 


0{h) 
0{h) 
0{h) 

0{h^''^ log h) 


0(/l3/2) 
0(/l3/2) 


0{h\\og hy) 



Table 1: Orders of convergence for as observed numerically and proved analytically for 
both the Euler and Milstein discretisations; 5 can be any strictly positive constant. 



with a computational complexity C with hound 

-2 /3>1 

C < <( 



C4 e 

C4e"2(loge)2, (3 = 1, 

2-{l-(S)/a^ 0</3<l. 



For the MLMC method based on the simple Euler discretisation, Giles, Higham and 
Mao [11] proved that Vi = 0{h) for European options (based on the final value of the 
underlying S{T)) with a uniform Lipschitz payoff, Asian options (based on the average 
value of the underlying) and lookback options (based on the minimum or maximum 
of the underlying). They also proved that Vi = o{h^^'^~^), for any 6 > 0, for barrier 
options (in which the payoff is zero if the underlying crosses, or fails to cross, a certain 
level) and digital options (for which the payoff is a discontinuous function of S{T)). The 
final result has been tightened by Avikainen [1] who proved that = 0{h^^'^ logh). 
As summarised in Table 1, numerical results |10) suggest that all of these results are 
near-optimal. 

For the MLMC method based on the Milstein discretisation, numerical results [9] 
suggest that Vi = 0{h'^) for European options with a uniform Lipschitz payoff and for 
Asian and lookback options, and Vi = 0{h^/'^) for the barrier and digital options. In 
this paper we aim to establish these orders of convergence analytically, and do so near 
optimally in each case. 



2 Previous results 

The analysis in this paper builds on a large body of results in the existing literature. 
They are included here for the sake of completeness. 

2.1 Solution of the SDE and its Milstein discretisation 

We will assume throughout this paper that the SDE ([1]) is scalar, and the drift function 
a G C2'i(MxM+) and volatility function h E C3'i(MxM+) satisfy the following standard 
conditions in which we use the notation Lq = d/dt + ad/dS and Li = bd/dS. 
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• Al (uniform Lipschitz condition): there exists Ki such that 

\a{x,t)-a{y,t)\ + \b{x,t)-b{y,t)\ + \Lib{x,t)-Lib{y,t)\ < Ki\x-y\ 

• A2 (hnear growth bound): there exists K2 such that 

|a(x,t)| + \Loa{x,t)\ + \Lia{x,t)\ + \b{x,t)\ + \LQb{x,t)\ 

+ \Lib{x,t)\ + \LoLib{x,t)\ + \LiLib{x,t)\ < K2{l + \x\ 

• A3 (additional Lipschitz condition): there exists K3 such that 



\bix,t)-bix,s)\ < K3il + \x\)^/\t-s\ 

Under these conditions, we have the fohowing result for the analytic solution to the 
SDE M- 



Theorem 2.1 Provided the assumptions A1-A3 are satisfied, then for all positive in- 
tegers m 



E 



sup |5(t)r 

.0<t<T 



< 00. 



With initial data 5*0 = S{0), the Milstein discretisation of equation ([T]) using a 
uniform timestep of size h is 

Sn+l = Sn + anh + bn AWn + \b'^bn {{AWnf " h) , (4) 

where b' = dbJdS, the subscript n denotes the timestep index, and a^, bn and 6^ are 
evaluated at Sn,tn- Kloeden and Platen [18] define a continuous time interpolant 



SKp{t) = Sn + an (t-tn) + 6„ {W{t)-Wn) + ^ 6^ 6„ {{W{t)-Wnf " (t-^n)) , (5) 

for tn<t<tn+i and prove the following result. 

Theorem 2.2 Provided the assumptions A1-A3 are satisfied, then for all positive in- 
tegers m there exists a constant Cm such that 



E 



sup \S{t) - SKp{t)\' 

0<t<T 



<Cmh'^, E 



sup \SKp{t)\" 

0<t<T 



<Cr, 



2.2 Brownian bridge results 

If the drift a and volatility are constant, the SDE ([T]) has solution 

S{t) = So + at + bW{t) 
and hence within the time interval of length h we have 

S{t) = Sn + X (Sn+l-Sn) + b {W{t) - Wn - X (Wn+l-Wn)) (6) 

where X = {t — tn)/h. This means that the deviation of S{t) from a piecewise linear 
interpolation of the values Sn = S{tn) is proportional to the deviation of W{t) from 
its piecewise linear interpolation. It can be proved that the distribution of the latter 
is independent of the Brownian increment Wn+i — Wn, and furthermore we have the 
following results (see for example '^^)- 
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Lemma 2.3 Conditional on Sn and Sn+i, the distribution for the integral of S{t) over 
the interval [tn,tn+i] is given by 



n + l 



S{t)dt=lh{Sn + Sn+l) + bIn (7) 



where 

rtn+i 



rtn + 1 

In= {W{t) -Wn-X (Wn+l-Wn)) dt 

J tn 

is a N{0, j2^^) Normal random variable, independent of Wn+i — Wn- 

Lemma 2.4 Conditional on Sn and Sn+i, the distributions for the minimum and 
maximum of S{t) over the interval [tm^n+i] are given by 

Sn,min = ^ (^Sn + Sn+1 — \J {Sn+l—Snf — 2b'^h log C/„ ^ , 
Sn,max = \ {^n + Sn+l + \J {Sn+l-Snf -2b'^h log K ^ , 

where Un and Vn are each uniformly distributed on the unit interval (0,1). 

Lemma 2.5 Provided b ^ 0, conditional on Sn and Sn+i, the probability that the 
minimum (or maximum) of S{t) over the interval [tn,tn+i] is less than (or greater 
than) some value B, is 

■ f ^ R I Q Q ^ ( -2{Sn-B) + {Sn+l-B)+ \ 

mf S{t) < B I Sn,Sn+i = exp -r- , (8) 



ltn,tn + l] J \ b^ h 

P ( Qm ^ R I Q Q ^ f -2iB-Sn) + {B-Sn+l)+ \ 

P sup S{t) > B \ Sn,Sn+i \ = exp -5- , (9) 

where the notation (x)^ means max(x,0). 

Corollary 2.6 IfW(t) is a Brownian motion with W{0) = W{1) = 0, then for x > 

F isupWit) > x] = F ( miW{t) < -x] = exp(-2x^), 
V[o,i] / Vio.i] J 



and hence E 



sup[o,i]iw^(t)r 



is finite for all positive integers m. 



2.3 Extreme values 

The following results come from extreme value theory which determine the limiting 
distribution of the maximum of a large set of i.i.d. random variables [7J. 
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Lemma 2.7 // C/„, n = 1, . . . , N are independent samples from a uniform distribution 
on the unit interval [0, 1], then for any positive integer m 



E 



max I log Un I 



0{{\ogNY 



as N 



oo. 



(10) 



Lemma 2.8 If Zn,n = 1,...,N are independent samples from a standard Normal 
distribution, then for any positive integer m 

= 0((logiV)™/2), asiV^oo. (11) 



E 



max I Zn I 

n 



Corollary 2.9 // Wn{t),n = 1, . . . ,N are independent Brownian paths on [0, 1], con- 
ditional on Wn{0) = Wnil) = 0, then for any positive integer m 



E 



max sup |iy„(t)|' 
" [0,1] 



0((logiV)"/2), asN 



oo. 



(12) 



Proof From Corollary 12. 6|. for sufficiently large x the tail probability for |VFn(t)| is 
less than that of a standard Normal random variable. 

2.4 Extreme paths 

Some of the proofs in fllj use an argument that certain "extreme" paths make a 
negligible contribution to the overall expectation. This same argument will be employed 
in this paper but in a more compact form based on these two lemmas. 

Lemma 2.10 // Xi is a scalar random variable defined on level i of the multilevel 
analysis, and for each positive integer m, E[|X^|™] is uniformly bounded, then, for any 
6>0, 

p(|x^| > =o(/i^), Vp>o. 

Proof Follows immediately from Markov's inequality 

IP (i^^i > v^) = IP {\xer > K""^) < iE[|x^r], 

by choosing m > p/5. 

Lemma 2.11 IfY^ is a scalar random variable on level I, E[Y^^] is uniformly bounded, 
and for each p > the indicator function Ie^ on level i (which takes value 1 or 
depending whether or not a path lies within some set E^) satisfies 

E[lsJ = o(/iP), 

then for each p > 0, 

E[|F^|li5j = o(/iP). 
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Proof Immediate consequence of Holder inequality which gives 



n\Yi\i 



^j<(E[y/])'/'(E[i^j) 



1/2 



In the proofs in the main analysis, Lemma I2.1UI will be used to establish the pre- 
conditions for Lemma 12.111 from which it can be concluded, by choosing p sufficiently 
large, that the contribution of the extreme paths is negligible compared to the paths 
that are not extreme. 



3 Analysis of the Milstein MLMC method 



3.1 Brownian interpolation 

In all of the cases to be analysed, the discrete paths are simulated using the Milstein 
method, with each level having twice as many timesteps as the previous level. This 
gives a set of values at discrete times, S{tn) where tn = nh. By approximating the drift 
and volatility as being constant within each timestep, we define the following Brownian 
interpolation based on equation ([6]), 



where A = (i — tn)/h. The advantage of this interpolation compared to the standard 
Kloeden-Platen interpolant is that we can use Lemmas 12.31 - 12.51 in constructing the 
multilevel estimators. The accuracy of the interpolant relative to the Kloeden-Platen 
interpolant is given by the following theorem: 



S{t) = Sn + X {Sn+l-Sn) + 6n (wit) - W„ - A (Wn+l-Wn) 



) 



(13) 




E sup S{t) - SKp{t) 

[0,T] 



m 



oiihioghD 



a) 



mi 



supE S{t)-SKp(t) 

[0,T] L 



iiij 
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Proof In each case we use the fact that E[max„ l^^&nl™'] is finite due to Theorem 
2.21 and Assumption A2. In addition, for t G the difference between the two 

interpolants is 



S{t)-SKpit) = ^b'^bnY{t), 



where 



Y{t) = X{Wn+l-Wn)^ -{W{t)-Wnf 

= \{l-X){Wn+i-Wnf -{W{t)-Wn-X{Wn+l-Wn)f 
- 2X{Wn+l-Wn) {W{t)-Wn-XiWn+l-Wn)) • 

i) Using Holder's inequahty, the assertion follows from 
S{t) - SKp{t) 



E 



sup 

[0,T] 



< 2" 



E 



max l^^^nP™ 

n 



E 



sup |y(i)|2'^ 
[o,r] 



together with bounds on E 
lary 



sup[o ■p] |y(t)p™' coming from Lemma [2.8l and Corol- 



ii) By setting W{t)-Wn = VXh Zi and Wn+i-W{t) = y/{l-X)h Z2, with Zi, Z2 
independent standard Normal random variables, one can prove that 



\Y\<hmax{Zt,Zi) =^ \Yr < /i™ max(Z^", < /i™ (Z^" + 

and hence the assertion follows from 



E 



Sit) - SKpit) = 2— E[ E[ \Y 



and standard bounds for moments of Normal random variables. 

'•tn + l 



rtn + l 

iii) Defining Xn '■= I Y{t) dt we obtain 

J tn 



E 









SKp{t)) d?j 


= lE 








\n=0 J 



For n>m, E[b'„^bmXmbnbnXn] = since Xn is independent of b'nfirnXmb'nbn and 
E[X„] = 0. In addition, the X^ are iid random variables, and therefore 



E 



(S{t) - SKp{t)) dt 







AT-l 



n=0 



The proof is completed by noting that E[Xq] = 0{h'^) due to standard moment 
bounds for Brownian increments. 
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3.2 Estimator construction 



For each Brownian input, the multilevel estimator ([3]) requires the calculation of the 
payoff difference P/ — Pg_i- Here P/ is a fine-path estimate using timestep h£ = 2~^T, 
and -P/_i is the corresponding coarse-path estimate using timestep h = 2^^^^^'^T. As 
explained in [9], to ensure that the identity ([2]) is correctly respected, it is required 
that 

E[P/_J=E[Pti]- (14) 
In the simplest case of a European option, this can be achieved very simply by defining 
P£_i and Pi_i to be the same. However, for the other applications the definition of 

P^_i involves information from the discrete simulation of P^ , which is not available 
in computing pI_i- This is done to reduce the variance of the estimator, but it must 
be shown that equality (|14p is satisfied. This will be achieved in each case through 
a construction based on the Brownian interpolant. In many cases this will involve 
evaluating the coarse path interpolant at the intermediate times for odd values of 
n, using the value for Wn which was used for the fine path. 

The analysis of the variance of the multilevel estimator will often use the following 
decomposition of the difference between the Brownian interpolants for the fine and 
coarse paths, 

Sf{t)-S\t) = (Sf{t)-S{,pit)) - iS%t)-S%p{t)) 

+ {Sip{t)-S{t)) - {S^^p{t)-S{t)) (15) 

with Theorem 13.1 1 bounding the error in the first two terms, and Theorem [2]2] bounding 
the error in the last two terms. 



3.3 Lipschitz payoffs 



Many European options, such as simple put and call options, have a payoff that is a 
Lipschitz function of the value of the underlying asset at maturity, 

p = /(5(r)). 

Discrete Asian options have a payoff which is a Lipschitz function of the value at 
maturity and the average of the underlying asset at a finite number of times T^ti, 

M 

S = M-'^S{T^). 

m=l 

Both of these are special cases of a more general class of Lipschitz payoffs in which 
the payoff is a Lipschitz function of the values of the underlying asset at a finite number 
of times T^, 

P = f{S{n),S{T2),...,S{TM)), 

with the Lipschitz bound 

M 

ffoCi) q(2) c.{2)x qW\ ^r\:^|c{2) q(1) 

m=l 
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for some constant L. In the numerical discretisation the fine and coarse path payoffs 
are both defined by 

P = f(S{T{),S{T2),...,S{TM)), 

with S{t) given by the Brownian interpolation. Note that this will require the additional 
simulation of W{Tjn) if does not correspond to one of the existing timesteps. 
We get the following result concerning the variance of the multilevel estimator: 

Theorem 3.2 This approximation for Lipschitz payoffs has Ve = 0{h'j). 

Proof From the Lipschitz bound and Jensen's inequality we obtain 

M 
m=l 

The decomposition (jlSp implies that 

E[(5^(r^)-s^(T™))2] < 4(E[(5^(r^)-5^p(r„))2]+ E[(5^(T^)-^^p(r„))2] 

+ E[{S{,p{T^)-S{T^)f]+E[{S'Kp{T^)-S{Tm)f]^ 
and the proof is completed using the results from Theorems 12.21 and 13.11 

3.4 Asian options 

Continuously monitored Asian options have a payoff that is a uniform Lipschitz func- 
tion of two arguments, the average over the time interval 

- r'^ 

S = T-^ / S{t) dt, 
Jo 

and the value at maturity, S{T). We now consider two alternative numerical approxi- 
mations. 

3.4.1 Treatment 1 

The first treatment is that used in [9], in which the fine and coarse path averages S 
are defined by integrating the interpolant (jl3p . Because of Lemma 12.31 this gives 

f-T N-i 

•^0 n=0 

where In are independent A^(0, j^^f ) variables. The payoff for the coarse path is defined 
similarly, but a straightforward calculation gives 

rtn+2 / f _f \ 

I'n = / [W{t)-Wn-—^{Wn+2-Wn))dt 
= ll+lUi-lhl{Wn+2-2Wn+l+Wn), 

and so is obtained from the Brownian path information used for the fine path. 
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Theorem 3.3 This approximation for continuous Asian payoffs has Vi = 0{hf). 
Proof Integrating (|15p gives 



+ nisiTp-sf] + nis^p-sf\ 

and the proof is completed using the Lipschitz bound and Theorems 12.21 and 13.11 
3.4.2 Treatment 2 

The second treatment is the same as the first except that it omits the terms In and 
and so the averages correspond to trapezoidal integration of the two interpolants, or 
ahernatively they can be viewed as averages of the piecewise linear interpolants SpL{t). 

Theorem 3.4 This approximation for continuous Asian options has Ve = 0{h'j). 

Proof The difference between the averages of the Brownian and piecewise linear 
interpolants is 

/ Sit) - SpL{t) At = V bnln. 
Since the /„ are iid A^(0, -^h^) variables, it follows that 

n 

and this is 0{h'j) due to the finite bound for E[max„ b'^]. 

Since 



sk - s'pL = is^-S'^) - isf-sk) + (S'^-s^^l) 

it follows that 



E[(5^L-^k)']<3 + E[iSf-Sk)'] + EiiS'^-Sf,^)^] 



The bounds on E,[{S—Spl)'^] together with the bound on K[{Sf — S'^)'^] from the proof 

of Theorem 13.31 prove that E[{Sk - Spi^)"^] = 0{h]), and the resuh then follows from 
the assumed Lipschitz property of the payoff. 

3.5 Lookback options 

In lookback options the payoff is a uniform Lipschitz function of the value of the un- 
derlying at maturity S{T), and either the minimum or the maximum of the underlying 
over the time interval. We will consider cases involving the minimum; the analysis for 
cases involving the maximum is very similar. 
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For the fine path simulation, we consider the conditional Brownian interpolation in 
the time interval [tn,tn+i] defined by 

P{t) = Sl + X {Sl^,-Sl) + bi {W{t) -Wn-X (T^„+i-T^„)) 



where X = {t — tn)/hi and bn = b{Sl,tn), and make use of Lemma to simulate the 
minimum on the time interval as 



(16) 



where C/„ is a uniform random variable on the unit interval. Taking the minimum over 

all timesteps gives the global minimum which is used to compute the fine path value 
pf 

For the coarse path value -P/_i) we do something slightly different. Using the same 
conditional Brownian interpolation, for even n we again use equation (I13p to define 
^n+v The minimum value over the interval [tn 5^11+2] can then be taken to be the 
smaller of the minima for the two intervals and [t„+i,t„+2]5 



QC 



qc _ cc 
'-'n 



qc 



n+2 



Sn+2-Sn+l] -2(&,^+i)2/l£logC/„+i 



(17) 



Here b'^ = b'^j^^ = b{S^,tn)- Note the re-use of the same uniform random numbers C/„ 
and Un+i used to compute the fine path minimum. Also, ^^^(S^ , ^j„) for 



level £ has exactly the same distribution as S*' 



/ 

n/2,min 



for level ^—1, since they are both 



based on the same approximate Brownian interpolation, and therefore equality (|14p is 
satisfied. 



Theorem 3.5 The multilevel approximation for a lookback option which is a uniform 
Lipschitz function of S{T) and inf[o_T] S{t) has Ve = 0{h'j{loghi)'^). 

Proof If SL^- and S!f^^^ are the computed minima for the fine and coarse paths, then 



qj _ qc 



< max 
n 



qj _ qc 

n,min '-'n,min 



< max 

n 



qf _ q^ 



+ max 

n 



dI-d: 



where 



S: 



n+l ■ 



.qf 



2{blYht\ogUn 



and is defined similarly. If Dn and are both zero, then D^\ =0. Oth- 



erwise, their sum is strictly positive and, using the inequality 



m - \y\ 



< \x - y\, 
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straightforward manipulations give 



Dl-R 



< 

< 
< 



(of C•/^^ ( QC Q 

Wn+l '-'"•J Wn+1 



c^2 
n7 



+ 



l(&^)'-(&n)'l he\logUn\ 



2{Di + 



I QC QC I 



+ 71 



I "ml 



^Jhi I logC/„| 



and hence 



sLin - Smir^ < 8 max (^S^ - + hi (max{bi - b^f^ (max \ log C/„ 



When n is even, assumption Al gives {bl - blf < K't [SL - 5^ ) , while for odd 
n we have 

< 2Kl (sl -Sl_,y + 2Kl {Si_, - S\ 



c 

n-l 



Now, 



Si - Sl_^, = an-ih + bn-iAWn-i + i6;_i6„-i((AW^„_i)2 - h, 



E 

from which it follows that 

E 

From Lemma 12.71 
and hence, 

E 



Asymptotically, the dominant term on the right is 6„_iAVF„_i, and it can be proved 
using the Jensen and Holder inequalities, the boundedness of E[max„ 6^] and Lemma 
11 that 

max{Sl - Sl_,f] =0{he\loghe\), 

n ;i. -I J 

max(fe^-6^)2l =0{hi\\oghi\). 

0{\\oghi\), 



E 



max I log Un I 

n 



Oihjiloghif), 



and the final result then follows from the uniform Lipschitz property of the payoff 
function and the bound 



maxE 

n 



0{hj). 
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3.6 Extreme paths 



The analysis of the variance of the multilevel estimators for barrier and digital options 
will follow the extreme path approach used in [TT|. We prepare for this with the 
following lemma in which we use the notation u -< /i" when u > and there exists a 
constant c > such that u < ch°', for sufhciently small h. Note that 



ui -< h 



U2 -< h 



Q-2 



Lemma 3.6 For any 7>0, the probability that a Brownian path W{t), its increments 
AWn = W{{n+l)h) — W{nh), and the corresponding SDE solution S{t) and its fine (h) 
and coarse (2h ) path approximations Sn and S"^ satisfy any of the following extreme 
conditions 



max ( max( 

n \ 



c(|S(n/i)|, \Sll 

max I max(|S'(n/i) — /S^l, \S{nh) — Sl\, \Sl^ — S^\) 

max I APV„ 

n 

sup Sf (t) - S{t) 

[0,T] 

sup \W{t)-W{t)\ 

[0,T] 



is o{h^) for all p>0. Here W{t) is defined to be the piecewise linear interpolant of the 
discrete values Wn- 



Furthermore, if none of these extreme conditions is satisfied, and j <\, th 



en 



max 5^ — 

n 


of 1 


-< 




27 


max bl^ — 

n 




-< 


h'l^- 


27 


maxmax( 6{ 

n 




-< 


h~^ 










max bl^ 


-K\ 


-< 




27 



(18) 
(19) 
(20) 
(21) 



where b'^ is defined to equal if n is odd. 



Proof The probability of the first two extreme conditions is o{hP) for all p>0 due to 
Theorems 12.11 and 12.21 and Lemma l2.1Ui Since 

P (raax\AWn\ > /i^/^"^) <^F (^\AWn\ > /i^/^"^) , 



the probability of the third is o{hP) for all p>0 due to Lemma |2.1U[ 

The fourth extreme condition has a o{hP) probability because Theorems 
13.11 together imply a uniform bound as /i — ?■ for 



and 



E 



[0,T] 
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for any m > 0. Similarly, the fifth is an extreme condition with o{hP) probability 
because of Corollary 12.61 

If none of the extreme conditions is satisfied, then using Assumption A2 gives 

and therefore ([TH]) is satisfied provided 7 < ^ so that h^/'^^'^^ is the dominant term in 
the above inequality. 

([19]) follows as a consequence because of Assumptions Al and A3, and (f20]) is 
obtained similarly from Assumption A2 and the bound on \Sn\ and \S^\. 

When n is even, the bound in (I2ip follows from Assumption Al and the bound on 
\Sn — S'f^\, while for odd n it requires the observation that 

and the bound then follows from (I19p and the corresponding bound for n — 1. 



3.7 Barrier options 

The barrier option which is considered is a down-and-out option for which the payoff is 
a Lipschitz function of the value of the underlying at maturity, provided the underlying 
has never dropped below a value B, 

p = /(5(r)) ir>T. 

Here lr>T is an indicator function taking value 1 if the argument is true, and zero 
otherwise, and the crossing time r is defined as 

r = inf {S(t) < B}. 
t>o ^ ^ ^ ' 

One approach would be to follow the lookback approximation in computing the 
minimum of both the fine and coarse paths. However, the variance would be larger 
in this case because the payoff is a discontinuous function of the minimum. A better 
treatment, which is the one used in [9], instead computes for each timestep the prob- 
ability that the minimum of the interpolant crosses the barrier, using the result from 
Lemma 12.51 This gives the conditional expectation for the payoff, conditional on the 
discrete Brownian increments of the fine path. For the fine path this gives 

N-l 

where 



n=0 



^/ ( -2{SL-B)+{SU^-B) 
pi. = exp 7 

The payoff for the coarse path is similarly defined as 

N-l 
n=0 
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where 



and for odd values of n, is defined by the usual interpolant and b'^ = b'^_i. 
Equality (dH) is satisfied in this case because 

u S^{t)>B I S^, S^^i, 

[tn,tn + 2\ 



inf S'{t)>B\S'^,S'^^,]=E 

ltn,tn + 2\ 



where the expectation on the r.h.s. is taken with respect to the distribution of the 
interpolated value 5*^+1, conditional on S^,S^^2- 

Theorem 3.7 Provided bmin = inf \biB,t)\ > 0, and inf S{t) has a bounded density 

[0,T] [0,T] 

in the neighbourhood of B, then the multilevel estimator for a down-and-out barrier 
option has variance = o{h?J'^ ^) for any 5>Q. 

Proof The proof involves dividing the paths into the following three subsets: 

(i) extreme paths; 

(ii) paths which are not extreme and for which \Smin — B \ > hy^ ^'^ for 0<7< |; 

(iii) the rest. 

Following the extreme path approach used in we start with 
V[P/-Pti] < IE[(P/-Pti)'] 

= E[(P/-Ptl)'l«] + 1E[(P/-Ptl)'l(n)] + E[(P/-Ptl)'l{m)] 

where the indicator functions have unit value for paths within the respective subsets. 
Each of these is considered in turn, and their contributions to E[{Pj^ — P^_-^^)'^] are 
bounded. 

(i) Paths are defined to be extreme if they satisfy any of the conditions of Lemma 
13.61 for < 7 < |. The Lipschitz bound for the payoff together with the bounds in 
Theorem 12.21 imply a uniform bound for E[(P^-^)^] and E[ (P^'^_^)'* ] and therefore also 
for E[(P/-P/_i)^]. Hence, by TheoremEIH E[{pI -P^_-^fl^,)] is for ah p>0. 

(ii) Suppose that S(t) attains its minimum at time r E [tn,tn+i]- 
First we consider the case Smin < B — h^/"^ . Starting with 

\Sl - < - Sf{T)\ + (r) - S{t)\, 

and noting that 

S^ir) -Sl= ^—^ (Si^, - SC) + hL{W{T) - W(r)), 
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we can conclude that \Sl — Smin\ ^ h^/'^ "^"^ . Hence, for sufficiently small hg, \Sn — 
Smin\ < h^"^ and so Sn is guaranteed to be less than B. In addition, Sl—S!^ < h\~^ 
and so, for sufficiently small /i^, is also guaranteed to be less than B and hence 
P/-Pti=0. 

In the alternate case Smin > B + h^"^ ^'^ , then 

minmin(5i{, S'J > B + h]'^'^^ - h]'^ 

n 

and since h\ ^ /i]^^ '^'^ it follows that Wn{^~Pn) and nn(-'-~-Pri) ^'^^ both equal to 
1 — oQi^f) for all p>0, and so |-P/— -P/_]^| -< h\~"' due to the Lipschitz condition and the 
bound on Sj^ — Sf^. Hence, the contribution to E[(P/ — P^'i-^)^] is at most 0{h^^~'^'^). 

(iii) Our first step is to note that if any one of Sn, Sl^_^i, S^, S'f^^i is greater than 
B + h^"^ , then the others will be greater than B + ^/i^^ , when hi is sufficiently 
small, since \Sl — S^^-^l ~< h^"^ and max(|S'j{ — l^^+i — S'n+il) — ^\~^ ■ ^^^^ 
case, pn and will both be o{h^), and so 



l[{i-K)=U^^-K) + o{h^,), 

;he set of indi 

B + h'/'-'^ 

Assume n £ R. We have Sn — < h}^ and Sl_^i — 5"^+! < fi^ due to the 
definition of extreme paths, and bn — b'^ -< hV"^ '^'^ , due to Lemma ESI If we now define 



n n£R 

and 

n n£R 

where R is the set of indices n for which none of Sn, S^j^i, Sn, S^_^_i is greater than 



xl = 



2(sL-Br(sU,-B) 

(biy w 

2(S^-i?)+(5;^+i-i?) 



f ■ 
then when Xn and are both strictly positive it follows, through the continuity of 

b{S,t) and for sufficiently small /i^, that min(|65^|, > \ bjnin, and hence through 

repeated use of the following identity, 

hgi -f2 92 = Ufi-h){9i+92) + \{h+f2){gi-g2), (23) 



and the fact that n £ Rto bound terms such as Sn—B, we obtain 



Xn — X, 



and hence 



xl - X^ 



< ^'^ , for sufficiently small h£. The same bound can also be 
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achieved in the other cases in which at least one of X„ and is equal to zero. If we 
define 



then we obtain 



= (l-p/)+p/(l-exp(X^-X^) 



Since 5(A) = [J [(l-pl) + pi Aj - Y[{1- p^) - A is convex, g{0) = and 

neB. n£R 

g{l) = -l\^^Ril-pi) < 0, we conclude that 5'(A)<0, VA G [0,1]. Hence, 



n&R 



Similarly, 1 — < (1 — p^) + p!^A^, which leads to 



riGR 



nGR 



and therefore 



Returning to the original products over all n. 



< Ae. 



ll(^-Pn)-U(^-K) 



-< h 



1/2-57 



This gives us P/ — -< h\^'^ , because of the bound on f{Sj^) and f{S'j^), and so 
the contribution from set (iii) to E[{Pj^ — P^_^)'^] is at most 0{h^^'^~^^'^). 

The proof is finally completed by choosing 7 < min(|,5/16). 

3.8 Digital options 

A digital option has a payoff which is a discontinuous function of the value of the 
underlying asset at maturity, the simplest example being 



P 



which has a unit payoff iff S{T) is greater than the strike K. 

The difficulty with the digital option is that the approach used in section 13.31 will 
lead to an 0{hi) fraction of the paths having coarse and fine path approximations to 
S{T) on either side of the strike, producing P/ — P^_i = ±1, resulting in = 0{hi). 
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To improve the variance to 0{h^ ) for all 6>0 we follow the approach which was 
tested numerically in [9], using the technique of conditional expectation (see section 
7.2.3 in dU). 

If ^N-i denotes the value of the fine path approximation one timestep before ma- 
turity, then if we approximate the motion thereafter as a simple Brownian motion 
with constant drift a^j^_^ = a{Slf_^,T — hi) and volatility b-^^_^ = b{Sj^__^,T — hi), the 

conditional expectation for the payoff is the probability that Sj^ > K after one further 
timestep, which is 

P/ = J^^^i±^i-^V (24) 

where $ is the cumulative Normal distribution. 

For the coarse path, we note that given the Brownian increment AM^jv_2 for the 
first half of the last coarse timestep (which comes from the fine path simulation), the 
probability that 5^ > K is 

pc ^ ^ ( S%.2 + '^a%.2ht+b%_^^WN-2 - k \ ^^^^ 

The conditional expectation of ()25p is equal to the conditional expectation of Pi_i 
defined by (p^ on level ^—1, and so equality (fH|) is satisfied. 

A bound on the variance of the multilevel estimator is given by the following result: 

Theorem 3.8 Provided b{K, T) ^ 0, and S{t) has a bounded density in the neighbour- 
hood of K , then the multilevel estimator for a digital option has variance = o{h^ ) 
for any 5>0. 

Proof As in the proof of Theorem 13.71 we split the paths into three subsets: 

(i) extreme paths; 

(ii) paths which are not extreme and for which \S]\f — K \ > h^J"^ ^'^ ] 

(iii) the rest. 

and we analyse the contributions to 'Ej[{Pj^ —P^_t^'^\ from all three subsets. 

(i) Paths are defined to be extreme if they satisfy any of the conditions of Lemma 
[Mlfor < 7 < i. E[(P-^)4] and E[(P'=)^] are both finite, and hence the contribution 
of the extreme paths is o(/i^), for all p > 0. 

(ii) If we define and S*^ to be the values which we would have obtained from 
the fine and coarse path simulations after the final timestep, then 

Sif_i+aif_ihg - K 
l^^_il 

Sl-K bf 
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and similarly 



S'L_,+2a'ir_,h^+b''^_,AWN^2 - K 



cc 



r7V-2l 

K 



"N-2 



\"N-2\ 



l"Ar-2l 



AWn^i + W)'n-2 ( {AWn-2 + AW, 



N-li 



2h, 



and \S{T)—Slq\ < h\ , and due 



Since the paths are not extreme, |A1/F„| < anu. }~'~>n 

to Lemma [3^ | htr ^ \ -< hj^ . Consequently, if S{T) > K-\-h\^'^ ^'^ 
small hi it follows that 



then for sufficiently 



^7 



> Ch 



-27 



for some suitably chosen constant C. A similar result follows for the corresponding 
coarse path, and hence for these paths P[ — P^_i = o{hF^), for all p > 0. A similar 

argument applies to the other paths for which S{T) < K — /i^^ ^'^ , and hence IE[(P/ — 
is ©(/i^) for all p>0 and so this contribution is also negligible. 

(iii) This subset consists of non-extreme paths for which \S{T) — K\ < /i^^ '^'^ . 
Since 

- b{K,T) = + (&^-6(i^,r)), 

using Assumption Al and with 7 < | we can conclude that for sufficiently small 

I^TV-i ~ ^(-^1^)1 < \ ^-iid in particular h^]^_^ is non-zero and of the same 

sign as b{K,T). The same also applies to b'j^_2 and hence, exploiting the Lipschitz 
property |'&(a;i) — <I>(x2)| < l^i— X2I, 



Pi - PL 



< 



< 



V -\-n-^ I 
^N-l^"'N-l' 



K 



S%_o + 2a%_^hi+h'i._2AWN-2 - K 



of 



uf I 
-K 



^N-2\ 



'hp 



S%-K 



+ \KihJ^'^ {(ATy^v-i)^ + {AWn-2 + I^Wn-i? + 3/1^ . 
Using the identity ([23|) we obtain 



of 



K 



qc 



K 



1 



^N-2\ 



+ 



2^hi 
1 



1 



N) 



+ 



1 



{Sf^ + S%-2K) 



'N~2\ 



d I 



2Vhi 

Using the bounds provided by Lemma [HTUl it follows that 



\"N-1\ \"N-2\ 



of 



K 



5' 



N 



K 



uf I 



'N-2\ 



0{h 



1/2-57N 
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and hence P/ — -P£_i = ©(/i^^ ^^). Since E[l(jjj)] = 0{hy^ '^'^) due to the bounded 
probabihty density for S{T), it follows that E[(P/-P/_ J^i^...^] = 0(hf^'^^^). Choos- 
ing 7 < min(i,5/13) completes the proof. 

4 Conclusions and future work 

In this paper we have proved that when using the Milstein discretisation for a scalar 
SDE the variance of the multilevel estimator is 0{h'j) for Lipschitz and Asian options, 

0{h'j{loghi)'^) for lookback options, and o{h^J'^ ^) for barrier and digital options, for 
any 5 > 0. 

Condition i) of Theorem 11.11 requires knowledge of the order of weak convergence. 
Theorems 12.21 and 13.11 together give 0{h) weak convergence for the Lipschitz and Asian 
options, and 0(/ilog h) convergence for the lookback option. For the digital and barrier 
options, the analysis of the multilevel convergence can be modified to instead consider 
E[P^ — P], and hence it can be proved that the weak order of convergence is o{h}-~^^ 
for any 5 > 0. In all cases, the weak convergence rate satisfies the inequality a > ^ 
required by Theorem II .11 and so it can be concluded that the computational cost to 
achieve a r.m.s. accuracy of e is ©(e"^) for all of the cases considered in this paper. 

There are lots of directions for future research. One is the extension to jump- 
diffusion [22] and exponential Levy processes [2T], to complement existing analyses for 
Levy-driven SDEs [HI [5l |6] . Others include the analysis of MLMC applied to the com- 
putation of sensitivities, usually referred to as "Greeks" [31 [2j, and also stopping times 
|20j . For path-dependent options there are possibilities of improving the order of con- 
vergence of the multilevel variance using adaptive algorithms [16], and the complexity 
can also be improved using quasi- Monte Carlo sampling |131 [8] 

The most natural direction in which to extend the analysis in this paper is to multi- 
dimensional SDEs. In cases in which a certain commutativity condition is satisfied 
(see, for example, page 353 in |14j ) there is a simple generalisation of the Milstein 
discretisation which still requires only the increments of the driving Brownian motions. 
In this case, most of the analysis in this paper will extend quite naturally. The only 
problem is that if Wi{t) and W2{t) are two correlated Brownian motions, then we do not 
have analytic expressions for the joint distribution of their minima and maxima over a 
given interval, conditional on the end values. This will cause problems in constructing 
the lookback and barrier estimators if the payoff depends on more than one minimum 
or maximum. 

Clark & Cameron [3] proved that in general it is not possible to achieve first order 
strong convergence for multi-dimensional SDEs using just the Brownian increments. 
If the commutativity condition is not satisfied, then the multi-dimensional Milstein 
discretisation requires the simulation of iterated Ito integrals known as Levy areas to 
achieve 0{h) strong convergence. If these terms are omitted, the strong convergence 
remains 0{h^'^), but a new paper [12] proves that it is still possible to achieve a good 
multilevel estimator through the use of antithetic variates. The variance convergence is 
0{h?) for smooth European (and Asian) payoffs, and 0{h^^'^) for Lipschitz European 
(and Asian) payoffs such as put and call options which are smooth almost everywhere. 
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Extending this analysis to digital, lookback and barrier options will be a challenge for 
the future. The optimal treatment is likely to require sub-sampling of the Brownian 
paths within each timestep to approximate the Levy areas, with the level of sub- 
sampling being a tradeoff between the cost and accuracy of the simulated Levy areas. 

Acknowledgements: The first author is grateful to Sylvestre Burgos, Lukasz 
Szpruch and Yuan Xia for their comments on the paper. 
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